Concord
FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs
Wan, Yingjia, Tan, Haochen, Zhu, Xiao, Zhou, Xinyu, Li, Zhiwei, Lv, Qingsong, Sun, Changxuan, Zeng, Jiaqi, Xu, Yi, Lu, Jianqiao, Liu, Yinhong, Guo, Zhijiang
Evaluating the factuality of long-form generations from Large Language Models (LLMs) remains challenging due to efficiency bottlenecks and reliability concerns. Prior efforts attempt this by decomposing text into claims, searching for evidence, and verifying claims, but suffer from critical drawbacks: (1) inefficiency due to overcomplicated pipeline components, and (2) ineffectiveness stemming from inaccurate claim sets and insufficient evidence. To address these limitations, we propose FaStfact, an evaluation framework that achieves the highest alignment with human evaluation and time/token efficiency among existing baselines. FaStfact first employs chunk-level claim extraction integrated with confidence-based pre-verification, significantly reducing the time and token cost while ensuring reliability. For searching and verification, it collects document-level evidence from crawled web pages and selectively retrieves it during verification. Extensive experiments based on an annotated benchmark, FaStfact-Bench, demonstrate the reliability of FaStfact in both efficiently and effectively evaluating long-form factuality. Code, benchmark data, and annotation interface tool are available at https://github.com/Yingjia-Wan/FaStfact.
- North America > United States > New Jersey > Bergen County > Rutherford (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.14)
- Europe > Austria > Vienna (0.14)
- (26 more...)
- Telecommunications (1.00)
- Leisure & Entertainment > Sports > Soccer (1.00)
- Information Technology (1.00)
- (4 more...)
Tim Berners-Lee Invented the World Wide Web. Now He Wants to Save It
In 1989, Sir Tim revolutionized the online world. Today, in the era of misinformation, addictive algorithms, and extractive monopolies, he thinks he can do it again. Berners-Lee is building tools that aim to resist the Big Tech platforms, give users control over their own data, and prevent A.I. from hollowing out the open web. Tim Berners-Lee may have the smallest fame-to-impact ratio of anyone living. Strangers hardly ever recognize his face; on "Jeopardy!," Berners-Lee invented the World Wide Web, in 1989, but people informed of this often respond with a joke: Wasn't that Al Gore? Still, his creation keeps growing, absorbing our reality in the process. If you're reading this online, Berners-Lee wrote the hypertext markup language (HTML) that your browser is interpreting. He's the necessary condition behind everything from Amazon to Wikipedia, and if A.I. brings about what Sam Altman recently called "the gentle singularity"--or else buries us in slop--that, too, will be an outgrowth of his global collective consciousness. Somehow, the man responsible for all of this is a mild-mannered British Unitarian who loves model trains and folk music, and recently celebrated his seventieth birthday with a picnic on a Welsh mountain. An emeritus professor at Oxford and M.I.T., he divides his time between the U.K., Canada, and Concord, Massachusetts, where he and his wife, Rosemary Leith, live in a stout greige house older than the Republic. On the summer morning when I visited, geese honked and cicadas whined. Leith, an investor and a nonprofit director who co-founded a dot-com-era women's portal called Flametree, greeted me at the door. "We're basically guardians of the house," she said, showing me its antique features. I almost missed Berners-Lee in the converted-barn kitchen, standing, expectantly, in a blue plaid shirt. He shook my hand, then glanced at Leith. Minutes later, he and I were gliding across a pond behind the house. 
Berners-Lee is bronzed and wiry, with sharp cheekbones and faraway blue eyes, the right one underscored by an X-shaped wrinkle. A twitchier figure emerged when he spoke.
- North America > United States > Massachusetts > Middlesex County > Concord (0.24)
- Europe > United Kingdom > England (0.14)
- Africa > Rwanda (0.14)
- (10 more...)
- Media > Music (1.00)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- (5 more...)
- Information Technology > Communications > Web (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Communications > Networks (1.00)
- (3 more...)
Acoustic evaluation of a neural network dedicated to the detection of animal vocalisations
Rouch, Jérémy, Ducrettet, M, Haupert, S, Emonet, R, Sèbe, F
The accessibility of long-duration recorders, adapted to sometimes demanding field conditions, has enabled the deployment of extensive animal population monitoring campaigns through ecoacoustics. The effectiveness of automatic signal detection methods, increasingly based on neural approaches, is frequently evaluated solely through machine learning metrics, while acoustic analysis of performance remains rare. As part of the acoustic monitoring of Rock Ptarmigan populations, we propose here a simple method for acoustic analysis of the detection system's performance. The proposed measure is based on relating the signal-to-noise ratio of synthetic signals to their probability of detection. We show how this measure provides information about the system and allows optimisation of its training. We also show how it enables modelling of the detection distance, thus offering the possibility of evaluating its dynamics according to the sound environment and accessing an estimation of the spatial density of calls.
- South America > French Guiana (0.04)
- North America > United States > Massachusetts > Middlesex County > Concord (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (3 more...)
WavePulse: Real-time Content Analytics of Radio Livestreams
Mittal, Govind, Gupta, Sarthak, Wagle, Shruti, Chopra, Chirag, DeMattee, Anthony J, Memon, Nasir, Ahamad, Mustaque, Hegde, Chinmay
Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at https://wave-pulse.io.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > New York > Kings County > New York City (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (215 more...)
- Media > Radio (1.00)
- Leisure & Entertainment (1.00)
- Government > Voting & Elections (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
AnuraSet: A dataset for benchmarking Neotropical anuran calls identification in passive acoustic monitoring
Cañas, Juan Sebastián, Toro-Gómez, Maria Paula, Sugai, Larissa Sayuri Moreira, Restrepo, Hernán Darío Benítez, Rudas, Jorge, Bautista, Breyner Posso, Toledo, Luís Felipe, Dena, Simone, Domingos, Adão Henrique Rosa, de Souza, Franco Leandro, Neckel-Oliveira, Selvino, da Rosa, Anderson, Carvalho-Rocha, Vítor, Bernardy, José Vinícius, Sugai, José Luiz Massao Moreira, Santos, Carolina Emília dos, Bastos, Rogério Pereira, Llusia, Diego, Ulloa, Juan Sebastián
Global change is predicted to induce shifts in anuran acoustic behavior, which can be studied through passive acoustic monitoring (PAM). Understanding changes in calling behavior requires the identification of anuran species, which is challenging due to the particular characteristics of neotropical soundscapes. In this paper, we introduce a large-scale multi-species dataset of anuran amphibian calls recorded by PAM that comprises 27 hours of expert annotations for 42 different species from two Brazilian biomes. We provide open access to the dataset, including the raw recordings, experimental setup code, and a benchmark with a baseline model of the fine-grained categorization problem. Additionally, we highlight the challenges of the dataset to encourage machine learning researchers to solve the problem of anuran call identification towards conservation policy.
- Europe > Spain > Galicia > Madrid (0.05)
- South America > Brazil > São Paulo > Campinas (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- (6 more...)
In Defense of Humanity
On July 13, 1833, during a visit to the Cabinet of Natural History at the Jardin des Plantes, in Paris, Ralph Waldo Emerson had an epiphany. Peering at the museum's specimens--butterflies, hunks of amber and marble, carved seashells--he felt overwhelmed by the interconnectedness of nature, and humankind's place within it. The experience inspired him to write "The Uses of Natural History," and to articulate a philosophy that put naturalism at the center of intellectual life in a technologically chaotic age--guiding him, along with the collective of writers and radical thinkers known as transcendentalists, to a new spiritual belief system. Through empirical observation of the natural world, Emerson believed, anyone could become "a definer and map-maker of the latitudes and longitudes of our condition"--finding agency, individuality, and wonder in a mechanized age. America was crackling with invention in those years, and everything seemed to be speeding up as a result.
- North America > United States > Massachusetts > Middlesex County > Concord (0.04)
- North America > United States > California (0.04)
- Europe (0.04)
- Law (0.47)
- Information Technology (0.47)
QAMPARI: An Open-domain Question Answering Benchmark for Questions with Many Answers from Multiple Paragraphs
Amouyal, Samuel Joseph, Wolfson, Tomer, Rubin, Ohad, Yoran, Ori, Herzig, Jonathan, Berant, Jonathan
Existing benchmarks for open-domain question answering (ODQA) typically focus on questions whose answers can be extracted from a single paragraph. By contrast, many natural questions, such as "What players were drafted by the Brooklyn Nets?", have a list of answers. Answering such questions requires retrieving and reading from many passages in a large corpus. We introduce QAMPARI, an ODQA benchmark where question answers are lists of entities spread across many paragraphs. We created QAMPARI by (a) generating questions with multiple answers from Wikipedia's knowledge graph and tables, (b) automatically pairing answers with supporting evidence in Wikipedia paragraphs, and (c) manually paraphrasing questions and validating each answer. We train ODQA models from the retrieve-and-read family and find that QAMPARI is challenging in terms of both passage retrieval and answer generation, reaching an F1 score of 32.8 at best. Our results highlight the need for developing ODQA models that handle a broad range of question types, including single and multi-answer questions.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > Australia (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- (12 more...)
- Research Report (0.69)
- Personal (0.46)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
Cybertrust: From Explainable to Actionable and Interpretable AI (AI2)
Galaitsi, Stephanie, Trump, Benjamin D., Keisler, Jeffrey M., Linkov, Igor, Kott, Alexander
To benefit from AI advances, users and operators of AI systems must have reason to trust them. Trust arises from multiple interactions, where predictable and desirable behavior is reinforced over time. Providing the system's users with some understanding of AI operations can support predictability, but forcing AI to explain itself risks constraining AI capabilities to only those reconcilable with human cognition. We argue that AI systems should be designed with features that build trust by bringing decision-analytic perspectives and formal tools into AI. Instead of trying to achieve explainable AI, we should develop interpretable and actionable AI. Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations. In doing so, it will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making and ensure broad benefits from deploying and advancing their computational capabilities.
- North America > United States > Massachusetts > Suffolk County > Boston (0.14)
- North America > United States > Maryland > Prince George's County > Adelphi (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Concord (0.04)
- Government > Military (1.00)
- Information Technology > Security & Privacy (0.69)
Autoencoding with XCSF
Preen, Richard J., Wilson, Stewart W., Bull, Larry
Autoencoders enable data dimensionality reduction and are a key component of many (deep) learning systems. This article explores the use of the XCSF online evolutionary reinforcement learning system to perform autoencoding. Initial results, using a neural network representation and combining artificial evolution with stochastic gradient descent, suggest it is an effective approach to data reduction. The approach adaptively subdivides the input domain into local approximations that are simpler than a global neural network solution. By allowing the number of neurons in the autoencoders to evolve, this further enables the emergence of an ensemble of structurally heterogeneous solutions to cover the problem space. In this case, networks of differing complexity are typically seen to cover different areas of the problem space. Furthermore, the rate of gradient descent applied to each layer is tuned via self-adaptive mutation, thereby reducing the parameter optimisation task.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New Jersey > Middlesex County > Piscataway (0.05)
- Europe > Germany > Berlin (0.04)
- (6 more...)
A Digital Neuromorphic Architecture Efficiently Facilitating Complex Synaptic Response Functions Applied to Liquid State Machines
Smith, Michael R., Hill, Aaron J., Carlson, Kristofor D., Vineyard, Craig M., Donaldson, Jonathon, Follett, David R., Follett, Pamela L., Naegle, John H., James, Conrad D., Aimone, James B.
Information in neural networks is represented as weighted connections, or synapses, between neurons. This poses a problem as the primary computational bottleneck for neural networks is the vector-matrix multiply when inputs are multiplied by the neural network weights. Conventional processing architectures are not well suited for simulating neural networks, often requiring large amounts of energy and time. Additionally, synapses in biological neural networks are not binary connections, but exhibit a nonlinear response function as neurotransmitters are emitted and diffuse between neurons. Inspired by neuroscience principles, we present a digital neuromorphic architecture, the Spiking Temporal Processing Unit (STPU), capable of modeling arbitrary complex synaptic response functions without requiring additional hardware components. We consider the paradigm of spiking neurons with temporally coded information as opposed to non-spiking rate coded neurons used in most neural networks. In this paradigm we examine liquid state machines applied to speech recognition and show how a liquid state machine with temporal dynamics maps onto the STPU, demonstrating the flexibility and efficiency of the STPU for instantiating neural algorithms.
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > United States > Massachusetts > Middlesex County > Medford (0.04)
- North America > United States > Massachusetts > Middlesex County > Concord (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Energy (1.00)
- Government > Regional Government > North America Government > United States Government (0.46)